Fanqi Wan

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder

Jan 31, 2026
Via arXiv

STEP3-VL-10B Technical Report

Jan 15, 2026
Via arXiv

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jan 09, 2026
Via arXiv

ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents

Aug 28, 2025
Via arXiv

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

May 23, 2025
Via arXiv

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization

May 23, 2025
Via arXiv

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

May 16, 2025
Via arXiv

FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

Apr 09, 2025
Via arXiv

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

Mar 06, 2025
Via arXiv

Advantage-Guided Distillation for Preference Alignment in Small Language Models

Feb 25, 2025
Via arXiv